Causal inference is the process of using assumptions, study designs, and estimation strategies to draw conclusions about the causal relationships between variables based on data. This allows researchers to better understand the underlying mechanisms at work in complex systems and make more informed decisions. In many settings, we may not fully observe all the confounders that affect both the treatment and outcome variables, which complicates the estimation of causal effects. To address this problem, a growing literature in both causal inference and machine learning proposes to use Instrumental Variables (IVs). This paper serves as the first effort to systematically and comprehensively introduce and discuss IV methods and their applications in both causal inference and machine learning. First, we provide the formal definition of IVs and discuss the identification problem of IV regression methods under different assumptions. Second, we categorize the existing work on IV methods into three streams according to the focus of the proposed methods: two-stage least squares with IVs, control function with IVs, and evaluation of IVs. For each stream, we present both the classical causal inference methods and recent developments in the machine learning literature. Then, we introduce a variety of applications of IV methods in real-world scenarios and provide a summary of the available datasets and algorithms. Finally, we summarize the literature, discuss the open problems, and suggest promising future research directions for IV methods and their applications. We also develop a toolkit of the IV methods reviewed in this survey at https://github.com/causal-machine-learning-lab/mliv.
Translated by Google Translate
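To make the two-stage least squares idea from the abstract above concrete, here is a minimal sketch on simulated data. It is not taken from the survey or from the mliv toolkit. The data-generating process, the coefficient values, and all function names (`simulate`, `ols_slope`, `two_stage_least_squares`) are assumptions chosen for illustration, with one instrument, one treatment, and one unobserved confounder.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Population covariance of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    return cov(x, y) / cov(x, x)


def two_stage_least_squares(z, x, y):
    """2SLS with a single instrument z, treatment x, and outcome y."""
    # Stage 1: regress the treatment on the instrument and keep the fitted values.
    slope_1 = ols_slope(z, x)
    intercept_1 = mean(x) - slope_1 * mean(z)
    x_hat = [intercept_1 + slope_1 * zi for zi in z]
    # Stage 2: regress the outcome on the fitted treatment.
    return ols_slope(x_hat, y)


def simulate(n=20000, true_effect=2.0, seed=0):
    """Hypothetical linear model with an unobserved confounder u."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)    # unobserved confounder, affects both x and y
        zi = rng.gauss(0, 1)   # instrument, affects y only through x
        xi = zi + u + rng.gauss(0, 1)
        yi = true_effect * xi + 3.0 * u + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
ols_estimate = ols_slope(x, y)                  # biased upward by the confounder
iv_estimate = two_stage_least_squares(z, x, y)  # close to the assumed effect of 2.0
print(f"OLS: {ols_estimate:.2f}, 2SLS: {iv_estimate:.2f}")
```

In this simulated setting the naive regression absorbs the confounder's influence, while the instrument-based estimate recovers the assumed effect up to sampling noise. With several instruments or covariates the same two stages apply, but each stage becomes a multivariate regression.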
The success of deep learning is partly attributed to the availability of massive data downloaded freely from the Internet. However, this also means that users' private data may be collected by commercial organizations without consent and used to train their models. It is therefore important to develop a method or tool that prevents unauthorized data exploitation. In this paper, we propose ConfounderGAN, a generative adversarial network (GAN) that can make personal image data unlearnable to protect the data privacy of its owners. Specifically, the noise produced by the generator for each image has the confounder property: it builds spurious correlations between images and labels, so that a model cannot learn the correct mapping from images to labels on the noise-added dataset. Meanwhile, the discriminator ensures that the generated noise is small and imperceptible, thereby preserving the normal utility of the encrypted image for humans. Experiments are conducted on six image classification datasets, consisting of three natural object datasets and three medical datasets. The results demonstrate that our method not only outperforms state-of-the-art methods in standard settings, but can also be applied to fast encryption scenarios. Moreover, we present a series of transferability and stability experiments to further illustrate the effectiveness and superiority of our method.
Offline multi-agent reinforcement learning (MARL) aims to learn effective multi-agent policies from pre-collected datasets, which is an important step toward the deployment of multi-agent systems in real-world applications. In practice, however, the individual behavior policies that generate multi-agent joint trajectories usually perform at different levels; e.g., one agent follows a random policy while the other agents follow medium policies. In a cooperative game with a global reward, an agent trained by existing offline MARL methods often inherits this random policy, jeopardizing the performance of the entire team. In this paper, we investigate offline MARL with explicit consideration of the diversity of agent-wise trajectories and propose a novel framework called Shared Individual Trajectories (SIT) to address this problem. Specifically, an attention-based reward decomposition network assigns credit to each agent through a differentiable key-value memory mechanism in an offline manner. These decomposed credits are then used to reconstruct the joint offline datasets into prioritized experience replay with individual trajectories, so that agents can share their good trajectories and conservatively train their policies with a graph attention network (GAT) based critic. We evaluate our method in both discrete control (i.e., StarCraft II and the multi-agent particle environment) and continuous control (i.e., multi-agent MuJoCo). The results indicate that our method achieves significantly better results on complex and mixed offline multi-agent datasets, especially when the difference in data quality between individual trajectories is large.
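Deploying machine learning models on mobile devices has attracted increasing attention. To cope with the limited hardware resources on the device while addressing model generalization, the device model needs to be made lightweight through techniques such as model compression from the cloud model. However, the main obstacle to improving the generalization of the device model is the distribution shift between the cloud data and the device data, since the data distribution on the device often changes over time (e.g., users may have different preferences in a recommender system). Although real-time fine-tuning and distillation methods take this situation into account, they require on-device training, which is practically infeasible because of low computational power and the lack of real-time labeled samples on the device. In this paper, we propose a novel task-agnostic framework, named MetaNetwork, for generating adaptive device model parameters from the cloud without on-device training. Specifically, our MetaNetwork is deployed on the cloud and consists of MetaGenerator and MetaStabilizer modules. The MetaGenerator is designed to learn a mapping function from samples to model parameters, and it can generate and deliver adaptive parameters to the device based on samples uploaded from the device to the cloud. The MetaStabilizer aims to reduce the oscillation of the MetaGenerator, accelerate convergence, and improve model performance during both training and inference. We evaluate our method on two tasks with three datasets. Extensive experiments show that MetaNetwork can achieve competitive performance in different modalities.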
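In the presence of unmeasured confounders, we address the problem of treatment effect estimation from data fusion, that is, from multiple datasets collected under different treatment assignment mechanisms. For example, marketers may assign different advertising strategies to the same products at different times or places. To handle the bias induced by unmeasured confounders and data fusion, we propose to separate the observational data into multiple groups (each group with an independent treatment assignment mechanism), and then explicitly model the group indicator as a Latent Group Instrumental Variable (LatGIV) to implement IV-based regression. In this paper, we conceptualize this line of thought and develop a unified framework to (1) estimate the distribution differences of observed variables across groups; (2) model the LatGIVs from the different treatment assignment mechanisms; and (3) plug in the LatGIVs to estimate the treatment-response function. Empirical results demonstrate the advantages of LatGIV compared with state-of-the-art methods.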
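In an era of information explosion, recommender systems play an important role in people's daily lives by facilitating content exploration. It is well known that user activity, i.e., the number of behaviors, tends to follow a long-tail distribution, with most users being only weakly active. In practice, we observe that after joint training, tail users receive recommendations of significantly lower quality than head users. We further find that a model trained on tail users alone still achieves inferior results because of limited data. Although long-tail distributions are ubiquitous in recommender systems, improving recommendation performance for tail users remains a challenge in both research and industry. Directly applying existing long-tail methods risks hurting the experience of head users, which is not viable because a small fraction of highly active head users contributes a considerable share of platform revenue. In this paper, we propose a novel approach that significantly improves recommendation performance for tail users while achieving at least comparable performance for head users relative to the base model. The essence of this approach is a novel gradient aggregation technique that learns the common knowledge shared by all users into a backbone model, followed by separate plugin prediction networks that personalize for head users and tail users. For common knowledge learning, we leverage the backdoor adjustment from causality theory to deconfound the gradient estimation, thereby shielding the backbone training from the confounder, i.e., user activity. We conduct extensive experiments on two public recommendation benchmark datasets and one large-scale industrial dataset collected from the Alipay platform. Empirical studies validate the rationality and effectiveness of our approach.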
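The backdoor adjustment mentioned in the abstract above can be illustrated on a toy discrete dataset. The sketch below shows only the generic adjustment formula, E[y | do(x)] = Σ_c P(c) · E[y | x, c], and is not the paper's gradient deconfounding procedure. The records, the counts, and the function names are hypothetical; `c` stands in for a binary confounder such as user activity.

```python
from collections import defaultdict


def build_records():
    """Hypothetical records (c, x, y): confounder, treatment, binary outcome."""
    # (c, x) -> (number of records, number of records with y == 1)
    spec = {(0, 0): (80, 16), (0, 1): (20, 8), (1, 0): (20, 12), (1, 1): (80, 64)}
    records = []
    for (c, x), (n, positives) in spec.items():
        records += [(c, x, 1)] * positives + [(c, x, 0)] * (n - positives)
    return records


def naive_mean(records, x_value):
    """E[y | x]: ignores the confounder entirely."""
    ys = [y for _, x, y in records if x == x_value]
    return sum(ys) / len(ys)


def backdoor_mean(records, x_value):
    """E[y | do(x)] = sum over c of P(c) * E[y | x, c].

    Assumes every confounder stratum contains at least one record with x_value.
    """
    stratum_size = defaultdict(int)
    outcomes = defaultdict(list)
    for c, x, y in records:
        stratum_size[c] += 1
        if x == x_value:
            outcomes[c].append(y)
    total = len(records)
    return sum(
        (stratum_size[c] / total) * (sum(outcomes[c]) / len(outcomes[c]))
        for c in stratum_size
    )


records = build_records()
naive_effect = naive_mean(records, 1) - naive_mean(records, 0)
adjusted_effect = backdoor_mean(records, 1) - backdoor_mean(records, 0)
print(f"naive: {naive_effect:.2f}, backdoor-adjusted: {adjusted_effect:.2f}")
```

In this constructed example the confounder makes the treatment look more than twice as effective as it is within each stratum; weighting each stratum by its marginal probability removes that distortion.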
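Great progress has been made in domain generalization (DG), which aims to learn a generalizable model from multiple well-annotated source domains to unknown target domains. However, in many practical scenarios, obtaining sufficient annotation for the source datasets can be prohibitively expensive. To escape the dilemma between domain generalization and annotation cost, in this paper we introduce a new task named label-efficient domain generalization (LEDG), which enables model generalization with label-limited source domains. To address this challenging task, we propose a novel framework called Collaborative Exploration and Generalization (CEG), which jointly optimizes active exploration and semi-supervised generalization. Specifically, in active exploration, to explore class and domain discriminability while avoiding information divergence and redundancy, we query the labels of the samples with the highest overall ranking of class uncertainty, domain representativeness, and information diversity. In semi-supervised generalization, we design MixUp-based intra- and inter-domain knowledge augmentation to expand domain knowledge and generalize domain invariance. We unify active exploration and semi-supervised generalization in a collaborative way and promote mutual enhancement between them, boosting model generalization with limited annotation. Extensive experiments show that CEG yields superior generalization performance. In particular, on the PACS dataset, CEG achieves competitive results with only a 5% data annotation budget compared to previous DG methods trained with fully labeled data.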
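For readers unfamiliar with MixUp, the basic operation behind the augmentation named in the abstract above is a convex combination of two samples and their labels. The sketch below shows only that generic operation. How CEG selects intra-domain and inter-domain pairs is not specified in the abstract, so the pairing, the `alpha` value, and the function name `mixup` are assumptions for illustration.

```python
import random


def mixup(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two feature vectors and their one-hot labels with a Beta-sampled weight."""
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    x_mixed = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mixed = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mixed, y_mixed, lam


# Hypothetical pair: two samples from different classes. In an inter-domain
# setting the partner would come from another source domain; in an
# intra-domain setting it would come from the same one.
rng = random.Random(0)
x_mixed, y_mixed, lam = mixup(
    [1.0, 0.0, 2.0], [1.0, 0.0],
    [0.0, 4.0, 2.0], [0.0, 1.0],
    rng=rng,
)
print(x_mixed, y_mixed, lam)
```

Because the labels are blended with the same coefficient as the inputs, the mixed label remains a valid probability vector.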
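Cooperative multi-agent reinforcement learning (MARL) has been widely used in many practical applications in which each agent makes decisions based on its own observation. Most mainstream methods treat each local observation as an integrated whole when modeling the decentralized local utility functions. However, they neglect the fact that local observation information can be further divided into several entities, and only some of those entities are helpful for model inference. Moreover, the importance of different entities may change over time. To improve the performance of decentralized policies, attention mechanisms are used to capture the features of local information. Nevertheless, existing attention models rely on dense fully connected graphs and cannot adequately perceive important states. To this end, we propose a sparse state based MARL (S2RL) framework, which utilizes a sparse attention mechanism to discard irrelevant information in local observations. The local utility functions are estimated through self-attention and sparse attention mechanisms separately, and are then combined into a standard joint value function and an auxiliary joint value function in the central critic. We design the S2RL framework as a plug-and-play module, making it general enough to be applied to various methods. Extensive experiments on StarCraft II show that S2RL can significantly improve the performance of many state-of-the-art methods.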
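One way to see how a sparse attention mechanism can discard irrelevant entities is to replace the softmax over attention scores with sparsemax, which projects the scores onto the probability simplex and assigns exactly zero weight to low-scoring entries. The abstract above does not state which sparse transformation S2RL uses, so treating sparsemax as the sparse attention here is an assumption, and the function names and example scores are hypothetical.

```python
def sparsemax(scores):
    """Euclidean projection of the scores onto the probability simplex."""
    ordered = sorted(scores, reverse=True)
    running_sum = 0.0
    support_size, support_sum = 0, 0.0
    for k, value in enumerate(ordered, start=1):
        running_sum += value
        # An entry stays in the support while 1 + k * value exceeds the running sum.
        if 1.0 + k * value > running_sum:
            support_size, support_sum = k, running_sum
    threshold = (support_sum - 1.0) / support_size
    return [max(s - threshold, 0.0) for s in scores]


def sparse_attend(query, keys, values):
    """Dot-product attention over entities with sparsemax weights."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = sparsemax(scores)
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return output, weights


# Three hypothetical entities in one agent's observation; the third scores low
# against the query and receives exactly zero weight.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.8, 0.5], [-1.0, 0.3]]
values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
output, weights = sparse_attend(query, keys, values)
print(weights, output)
```

Unlike softmax, which always leaves every entity with a small positive weight, this weighting removes low-scoring entities from the output entirely.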
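Annotating large amounts of data to satisfy sophisticated learning models can be prohibitively expensive for many real-world applications. Active learning (AL) and semi-supervised learning (SSL) are two effective but often isolated means of alleviating the data-hungry problem. Some recent studies have explored the potential of combining AL and SSL to better probe unlabeled data. However, almost all of these contemporary SSL-AL works adopt a simple combination strategy and ignore the inherent relation between SSL and AL. Moreover, other methods suffer from high computational costs when dealing with large-scale, high-dimensional datasets. Motivated by the industry practice of labeling data, we propose an innovative inconsistency-based virtual adversarial active learning (IDEAL) algorithm to further investigate the potential of SSL-AL and achieve mutual enhancement of AL and SSL: SSL propagates label information to unlabeled samples and provides smoothed embeddings for AL, while AL excludes samples with inconsistent predictions and considerable uncertainty. We estimate the inconsistency of unlabeled samples through augmentation strategies of different granularities, including fine-grained continuous perturbation exploration and coarse-grained data transformations. Extensive experiments in both the text and image domains validate the effectiveness of the proposed algorithm and compare it against state-of-the-art baselines. Two real-world case studies visualize the practical industrial value of applying and deploying the proposed data sampling algorithm.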
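Recent studies have shown that introducing communication between agents can significantly improve the overall performance of cooperative multi-agent reinforcement learning (MARL). In many real-world scenarios, however, communication can be expensive, and the bandwidth of the multi-agent system is subject to certain constraints. Redundant messages that occupy communication resources can block the transmission of informative messages and thus jeopardize performance. In this paper, we aim to learn minimal sufficient communication messages. First, we initiate the communication between agents with a complete graph. Then we introduce the graph information bottleneck (GIB) principle into this complete graph and derive the optimization over graph structures. Based on the optimization, we propose a novel multi-agent communication module, called CommGIB, which effectively compresses the structure information and node information in the communication graph to deal with bandwidth-constrained settings. Extensive experiments are conducted on Traffic Control and StarCraft II. The results show that, compared with state-of-the-art algorithms, the proposed method achieves better performance in bandwidth-restricted settings, with especially large margins in large-scale multi-agent tasks.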